KANIS: Preserving k-Anonymity Over Distributed Data
Authors
Abstract
In this paper we describe KANIS, a distributed system designed to preserve the privacy of multidimensional, hierarchical data that are dispersed over a network. While allowing for efficient storing, indexing and querying of the data, our system employs an adaptive scheme that automatically adjusts the level of indexing according to the privacy constraints: efficient roll-up and drill-down operations take place in order to guarantee k-anonymity while minimizing data distortion and inconsistency. Thus, our system maintains k-anonymity of the published data in a distributed and on-line manner even under frequent updates, without affecting its ability to answer queries efficiently. The initial experimental evaluation of our prototype shows that KANIS preserves k-anonymity while improving data quality by up to 22% compared to a popular centralized global recoding algorithm. It achieves near-optimal distortion regardless of the network or dataset size, with a reasonable communication overhead that is spread among the participating nodes.
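The roll-up idea mentioned in the abstract can be illustrated with a small, centralized sketch. This is not the KANIS protocol itself, which is distributed and also drills down to reduce distortion; it only shows how generalizing a hierarchical attribute one level at a time can reach k-anonymity. The hierarchy `PARENT`, the sample records, and the function names are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical hierarchy (an assumption, not from the paper): city -> country -> "*"
PARENT = {
    "Athens": "Greece",
    "Patras": "Greece",
    "Rome": "Italy",
    "Milan": "Italy",
    "Greece": "*",
    "Italy": "*",
}


def roll_up(value):
    """Generalize a value one level up; the root '*' maps to itself."""
    return PARENT.get(value, "*")


def is_k_anonymous(values, k):
    """True if every distinct value occurs at least k times."""
    return all(count >= k for count in Counter(values).values())


def anonymize(values, k):
    """Roll up all values together until k-anonymity holds.

    Returns the generalized values and the number of roll-ups applied.
    If fewer than k records exist, the loop stops at the root level
    and the result is not k-anonymous.
    """
    level = 0
    current = list(values)
    while not is_k_anonymous(current, k):
        rolled = [roll_up(v) for v in current]
        if rolled == current:  # already at the root, cannot generalize further
            break
        current = rolled
        level += 1
    return current, level


records = ["Athens", "Athens", "Patras", "Rome", "Milan", "Milan"]
generalized, level = anonymize(records, k=2)
print(generalized, level)
```

In this toy data, "Patras" and "Rome" each appear once, so the city level violates 2-anonymity and one roll-up to the country level is enough. Every roll-up discards detail, which is why a system of this kind would try to use the lowest level that still satisfies the privacy constraint.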
Similar Resources
Privacy-Preserving Distributed k-Anonymity
k-anonymity provides a measure of privacy protection by preventing re-identification of data to fewer than a group of k data items. While algorithms exist for producing k-anonymous data, the model has been that of a single source wanting to publish data. This paper presents a k-anonymity protocol when the data is vertically partitioned between sites. A key contribution is a proof that the proto...
Parallelizing K-Anonymity Algorithm for Privacy Preserving Knowledge Discovery from Big Data
Disclosure control has become inevitable as privacy is given paramount importance while publishing data for mining. The data mining community enjoyed a revival after Samarati and Sweeney proposed k-anonymization for privacy preserving data mining. k-anonymity has gained high popularity in research circles. Though it has some drawbacks and other PPDM algorithms such as l-diversity, t-closeness ...
State-of-art in Statistical Anonymization Techniques for Privacy Preserving Data Mining
With the increased and vast use of online data, security in data mining has now become very important. Anonymity techniques have proved very useful in distributed computation. More techniques are still being researched and improved to achieve a higher level of security for sensitive data. In this paper, we provide a review of the statistical anonymization methods that can be applied for priva...
On k-Anonymity and the Curse of Dimensionality
In recent years, the wide availability of personal data has made the problem of privacy preserving data mining an important one. A number of methods have recently been proposed for privacy preserving data mining of multidimensional data records. One of the methods for privacy preserving data mining is that of anonymization, in which a record is released only if it is indistinguishable from k ot...
A Novel Anonymity Algorithm for Privacy Preserving in Publishing Multiple Sensitive Attributes
Publishing data with multiple sensitive attributes poses a greater privacy-preservation challenge than publishing data with a single sensitive attribute. In this study, we propose a novel privacy preserving model based on k-anonymity called (α, β, k)-anonymity for databases. (α, β, k)-anonymity can be used to protect data with multiple sensitive attributes in data publishing...